Analysis of GLDS-208 from NASA GeneLab

This R markdown file was auto-generated by the iDEP website Using iDEP 0.91, originally by Steven

Ge SX, Son EW, Yao R: iDEP: an integrated web application for differential expression and pathway analysis of RNA-Seq data. BMC Bioinformatics 2018, 19(1):534. PMID:30567491

1. Read data

First we set up the working directory to where the files are saved.

 setwd('~/Documents/HTML_R/GLDS208_RNASEQ')

R packages and iDEP core Functions. Users can also download the iDEP_core_functions.R file. Many R packages needs to be installed first. This may take hours. Each of these packages took years to develop.So be a patient thief. Sometimes dependencies needs to be installed manually. If you are using an older version of R, and having trouble with package installation, try un-install the current version of R, delete all folders and files (C:/Program Files/R/R-3.4.3), and reinstall from scratch.

 if(file.exists('iDEP_core_functions.R'))
    source('iDEP_core_functions.R') else 
    source('https://raw.githubusercontent.com/iDEP-SDSU/idep/master/shinyapps/idep/iDEP_core_functions.R') 

We are using the downloaded gene expression file where gene IDs has been converted to Ensembl gene IDs. This is because the ID conversion database is too large to download. You can use your original file if your file uses Ensembl ID, or you do not want to use the pathway files available in iDEP (or it is not available).

 inputFile <- 'GLDS208_Expression.csv'
 sampleInfoFile <- 'GLDS208_Sampleinfo.csv'
 gldsMetadataFile <- 'GLDS208_Metadata.csv'
 geneInfoFile <- 'Arabidopsis_thaliana__athaliana_eg_gene_GeneInfo.csv' #Gene symbols, location etc. 
 geneSetFile <- 'Arabidopsis_thaliana__athaliana_eg_gene.db'  # pathway database in SQL; can be GMT format 
 STRING10_speciesFile <- 'https://raw.githubusercontent.com/iDEP-SDSU/idep/master/shinyapps/idep/STRING10_species.csv' 

Parameters for reading data

 input_missingValue <- 'geneMedian' #Missing values imputation method
 input_dataFileFormat <- 1  #1- read counts, 2 FKPM/RPKM or DNA microarray
 input_minCounts <- 0.5 #Min counts
 input_NminSamples <- 1 #Minimum number of samples 
 input_countsLogStart <- 4  #Pseudo count for log CPM
 input_CountsTransform <- 1 #Methods for data transformation of counts. 1-EdgeR's logCPM 2-VST, 3-rlog 
readMetadata.out <- readMetadata(gldsMetadataFile)
library(knitr)   #  install if needed. for showing tables with kable
library(kableExtra)
kable( readMetadata.out ) %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  scroll_box(width = "100%")
Atha_Col0_Root_ZoneI_Rep1 Atha_Col0_Root_ZoneI_Rep2 Atha_Col0_Root_ZoneI_Rep3 Atha_Col0_Root_ZoneII_Rep1 Atha_Col0_Root_ZoneII_Rep2 Atha_Col0_Root_ZoneII_Rep3
Sample.LongId Atha.Col.0.Root.Zone.I.Rep1.RNAseq.RNAseq Atha.Col.0.Root.Zone.I.Rep2.RNAseq.RNAseq Atha.Col.0.Root.Zone.I.Rep3.RNAseq.RNAseq Atha.Col.0.Root.Zone.II.Rep1.RNAseq.RNAseq Atha.Col.0.Root.Zone.II.Rep2.RNAseq.RNAseq Atha.Col.0.Root.Zone.II.Rep3.RNAseq.RNAseq
Sample.Id Atha.Col.0.Root.Zone.I.Rep1 Atha.Col.0.Root.Zone.I.Rep2 Atha.Col.0.Root.Zone.I.Rep3 Atha.Col.0.Root.Zone.II.Rep1 Atha.Col.0.Root.Zone.II.Rep2 Atha.Col.0.Root.Zone.II.Rep3
Sample.Name Atha_Col-0_Root_Zone-I_Rep1 Atha_Col-0_Root_Zone-I_Rep2 Atha_Col-0_Root_Zone-I_Rep3 Atha_Col-0_Root_Zone-II_Rep1 Atha_Col-0_Root_Zone-II_Rep2 Atha_Col-0_Root_Zone-II_Rep3
GLDS 208 208 208 208 208 208
Accession GLDS-208 GLDS-208 GLDS-208 GLDS-208 GLDS-208 GLDS-208
Hardware Petri dish Petri dish Petri dish Petri dish Petri dish Petri dish
Tissue Meristem zone 1 Meristem zone 1 Meristem zone 1 Elongation zone 2 Elongation zone 2 Elongation zone 2
Age 8 days 8 days 8 days 8 days 8 days 8 days
Organism Arabidopsis thaliana Arabidopsis thaliana Arabidopsis thaliana Arabidopsis thaliana Arabidopsis thaliana Arabidopsis thaliana
Ecotype Col-0 Col-0 Col-0 Col-0 Col-0 Col-0
Genotype WT WT WT WT WT WT
Variety Col-0 WT Col-0 WT Col-0 WT Col-0 WT Col-0 WT Col-0 WT
Radiation Background Earth Background Earth Background Earth Background Earth Background Earth Background Earth
Gravity Terrestrial Terrestrial Terrestrial Terrestrial Terrestrial Terrestrial
Developmental 8 day old seedlings root tip meristem and elongation zone 8 day old seedlings root tip meristem and elongation zone 8 day old seedlings root tip meristem and elongation zone 8 day old seedlings root tip meristem and elongation zone 8 day old seedlings root tip meristem and elongation zone 8 day old seedlings root tip meristem and elongation zone
Time.series.or.Concentration.gradient Single time point Single time point Single time point Single time point Single time point Single time point
Light White light White light White light White light White light White light
Assay..RNAseq. RNAseq Transcription Profiling RNAseq Transcription Profiling RNAseq Transcription Profiling RNAseq Transcription Profiling RNAseq Transcription Profiling RNAseq Transcription Profiling
Temperature 18-20 18-20 18-20 18-20 18-20 18-20
Treatment.type Comparative gene expression analysis in the Arabidopsis thaliana root apex using RNA-seq and microarray transcriptome profiles (microarray) Comparative gene expression analysis in the Arabidopsis thaliana root apex using RNA-seq and microarray transcriptome profiles (microarray) Comparative gene expression analysis in the Arabidopsis thaliana root apex using RNA-seq and microarray transcriptome profiles (microarray) Comparative gene expression analysis in the Arabidopsis thaliana root apex using RNA-seq and microarray transcriptome profiles (microarray) Comparative gene expression analysis in the Arabidopsis thaliana root apex using RNA-seq and microarray transcriptome profiles (microarray) Comparative gene expression analysis in the Arabidopsis thaliana root apex using RNA-seq and microarray transcriptome profiles (microarray)
Treatment.intensity x x x x x x
Treament.timing x x x x x x
Preservation.Method. RNAlater RNAlater RNAlater RNAlater RNAlater RNAlater
 readData.out <- readData(inputFile) 
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
   kable( head(readData.out$data) ) %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  scroll_box(width = "100%") 
Atha_Col0_Root_ZoneI_Rep1 Atha_Col0_Root_ZoneI_Rep2 Atha_Col0_Root_ZoneI_Rep3 Atha_Col0_Root_ZoneII_Rep1 Atha_Col0_Root_ZoneII_Rep2 Atha_Col0_Root_ZoneII_Rep3
AT1G75750 14.88189 14.96816 14.53326 8.50174 5.264568 2.796958
AT5G56540 14.27907 15.06916 14.87850 10.96798 10.427013 9.505186
AT1G62480 14.77619 15.18363 15.10583 13.56028 12.948680 12.509942
AT4G34050 14.81464 14.78146 14.56046 11.06152 10.170371 10.517753
AT1G21310 15.07387 14.56679 14.81694 12.49484 12.237116 12.634518
AT2G02130 14.77509 14.74724 14.47653 11.70054 10.777718 10.785445
 readSampleInfo.out <- readSampleInfo(sampleInfoFile) 
 kable( readSampleInfo.out ) %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  scroll_box(width = "100%") 
Tissue
Atha_Col0_Root_ZoneI_Rep1 Meristem zone 1
Atha_Col0_Root_ZoneI_Rep2 Meristem zone 1
Atha_Col0_Root_ZoneI_Rep3 Meristem zone 1
Atha_Col0_Root_ZoneII_Rep1 Elongation zone 2
Atha_Col0_Root_ZoneII_Rep2 Elongation zone 2
Atha_Col0_Root_ZoneII_Rep3 Elongation zone 2
 input_selectOrg ="NEW" 
 input_selectGO <- 'GOBP'   #Gene set category 
 input_noIDConversion = TRUE  
 allGeneInfo.out <- geneInfo(geneInfoFile) 
 converted.out = NULL 
 convertedData.out <- convertedData()    
 nGenesFilter()  
## [1] "16156 genes in 6 samples. 16156  genes passed filter.\n Original gene IDs used."
 convertedCounts.out <- convertedCounts()  # converted counts, just for compatibility 

2. Pre-process

# Read counts per library 
 parDefault = par() 
 par(mar=c(12,4,2,2)) 
 # barplot of total read counts
 x <- readData.out$rawCounts
 groups = as.factor( detectGroups(colnames(x ) ) )
 if(nlevels(groups)<=1 | nlevels(groups) >20 )  
  col1 = 'green'  else
  col1 = rainbow(nlevels(groups))[ groups ]             
         
 barplot( colSums(x)/1e6, 
        col=col1,las=3, main="Total read counts (millions)")  

 readCountsBias()  # detecting bias in sequencing depth 
## [1] 0.01124929
## [1] 0.01124929
## [1] "Warning! Sequencing depth bias detected. Total read counts are significantly different among sample groups (p= 1.12e-02 ) based on ANOVA.  Total read counts seem to be correlated with factor Tissue (p= 1.12e-02 ).  "
 # Box plot 
 x = readData.out$data 
 boxplot(x, las = 2, col=col1,
    ylab='Transformed expression levels',
    main='Distribution of transformed data') 

 #Density plot 
 par(parDefault) 
## Warning in par(parDefault): graphical parameter "cin" cannot be set
## Warning in par(parDefault): graphical parameter "cra" cannot be set
## Warning in par(parDefault): graphical parameter "csi" cannot be set
## Warning in par(parDefault): graphical parameter "cxy" cannot be set
## Warning in par(parDefault): graphical parameter "din" cannot be set
## Warning in par(parDefault): graphical parameter "page" cannot be set
 densityPlot()       

 # Scatter plot of the first two samples 
 plot(x[,1:2],xlab=colnames(x)[1],ylab=colnames(x)[2], 
    main='Scatter plot of first two samples') 

 ####plot gene or gene family
 input_selectOrg ="BestMatch" 
 input_geneSearch <- 'HOXA' #Gene ID for searching 
 genePlot()  
## NULL
 input_useSD <- 'FALSE' #Use standard deviation instead of standard error in error bar? 
 geneBarPlotError()       
## NULL

3. Heatmap

 # hierarchical clustering tree
 x <- readData.out$data
 maxGene <- apply(x,1,max)
 # remove bottom 25% lowly expressed genes, which inflate the PPC
 x <- x[which(maxGene > quantile(maxGene)[1] ) ,] 
 plot(as.dendrogram(hclust2( dist2(t(x)))), ylab="1 - Pearson C.C.", type = "rectangle") 

 #Correlation matrix
 input_labelPCC <- TRUE #Show correlation coefficient? 
 correlationMatrix() 

 # Parameters for heatmap
 input_nGenes <- 1000   #Top genes for heatmap
 input_geneCentering <- TRUE    #centering genes ?
 input_sampleCentering <- FALSE #Center by sample?
 input_geneNormalize <- FALSE   #Normalize by gene?
 input_sampleNormalize <- FALSE #Normalize by sample?
 input_noSampleClustering <- FALSE  #Use original sample order
 input_heatmapCutoff <- 4   #Remove outliers beyond number of SDs 
 input_distFunctions <- 1   #which distant funciton to use
 input_hclustFunctions <- 1 #Linkage type
 input_heatColors1 <- 1 #Colors
 input_selectFactorsHeatmap <- 'Tissue' #Sample coloring factors 
 png('heatmap.png', width = 10, height = 15, units = 'in', res = 300) 
 staticHeatmap() 
 dev.off()  
## png 
##   2

[heatmap] (heatmap.png)

 heatmapPlotly() # interactive heatmap using Plotly 

4. K-means clustering

 input_nGenesKNN <- 2000    #Number of genes fro k-Means
 input_nClusters <- 4   #Number of clusters 
 maxGeneClustering = 12000
 input_kmeansNormalization <- 'geneMean'    #Normalization
 input_KmeansReRun <- 0 #Random seed 

 distributionSD()  #Distribution of standard deviations 

 KmeansNclusters()  #Number of clusters 

 Kmeans.out = Kmeans()   #Running K-means 
 KmeansHeatmap()   #Heatmap for k-Means 

 #Read gene sets for enrichment analysis 
 sqlite  <- dbDriver('SQLite')
 input_selectGO3 <- 'GOBP'  #Gene set category
 input_minSetSize <- 15 #Min gene set size
 input_maxSetSize <- 2000   #Max gene set size 
 GeneSets.out <-readGeneSets( geneSetFile,
    convertedData.out, input_selectGO3,input_selectOrg,
    c(input_minSetSize, input_maxSetSize)  )  
 # Alternatively, users can use their own GMT files by
 #GeneSets.out <- readGMTRobust('somefile.GMT')  
 results <- KmeansGO()  #Enrichment analysis for k-Means clusters   
 results$adj.Pval <- format( results$adj.Pval,digits=3 )
 kable( results, row.names=FALSE) %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  scroll_box(width = "100%") 
Cluster adj.Pval Genes Pathways
A 9.41e-229 253 Amide biosynthetic process
4.69e-228 244 Translation
1.31e-227 244 Peptide biosynthetic process
6.81e-221 247 Peptide metabolic process
1.32e-216 258 Cellular amide metabolic process
4.79e-180 290 Organonitrogen compound biosynthetic process
1.36e-129 153 Ribonucleoprotein complex biogenesis
2.34e-110 127 Ribosome biogenesis
3.94e-89 185 Cellular component biogenesis
9.73e-79 66 Ribosomal large subunit biogenesis
B 7.37e-49 128 Oxidation-reduction process
6.91e-39 127 Small molecule metabolic process
8.49e-39 83 Drug metabolic process
1.97e-30 62 Nucleobase-containing small molecule metabolic process
1.29e-28 122 Response to abiotic stimulus
4.83e-27 77 Response to inorganic substance
9.37e-26 54 Generation of precursor metabolites and energy
2.28e-25 54 Response to metal ion
3.17e-25 34 ATP metabolic process
3.17e-25 44 Purine-containing compound metabolic process
C 2.40e-21 36 Cell wall organization or biogenesis
6.26e-21 32 Cell wall organization
2.20e-20 32 External encapsulating structure organization
1.12e-13 20 Plant-type cell wall organization or biogenesis
2.77e-10 15 Drug catabolic process
3.26e-10 11 Hydrogen peroxide catabolic process
7.32e-10 12 Phenylpropanoid metabolic process
1.97e-09 11 Antibiotic catabolic process
2.17e-09 13 Plant-type cell wall organization
2.17e-09 15 Cell wall biogenesis
D 5.09e-16 30 Secondary metabolic process
2.61e-13 18 Phenylpropanoid metabolic process
6.94e-13 15 Phenylpropanoid biosynthetic process
6.94e-13 19 Secondary metabolite biosynthetic process
1.57e-12 14 Lignin metabolic process
1.78e-10 11 Lignin biosynthetic process
2.79e-10 53 Response to oxygen-containing compound
3.61e-09 58 Response to organic substance
5.56e-09 60 Response to abiotic stimulus
5.56e-09 37 Response to inorganic substance
 input_seedTSNE <- 0    #Random seed for t-SNE
 input_colorGenes <- TRUE   #Color genes in t-SNE plot? 
 tSNEgenePlot()  #Plot genes using t-SNE 

5. PCA and beyond

 input_selectFactors <- 'Tissue'    #Factor coded by color
 input_selectFactors2 <- 'Sample_Name'  #Factor coded by shape
 input_tsneSeed2 <- 0   #Random seed for t-SNE 
 #PCA, MDS and t-SNE plots
 PCAplot()  

 MDSplot() 

 tSNEplot()  

 #Read gene sets for pathway analysis using PGSEA on principal components 
 input_selectGO6 <- 'GOBP' 
 GeneSets.out <-readGeneSets( geneSetFile,
    convertedData.out, input_selectGO6,input_selectOrg,
    c(input_minSetSize, input_maxSetSize)  )  
 PCApathway() # Run PGSEA analysis 
## Warning: Package 'KEGG.db' is deprecated and will be removed from Bioconductor
##   version 3.12

 cat( PCA2factor() )   #The correlation between PCs with factors 
## 
##  Correlation between Principal Components (PCs) with factors
## PC1 is correlated with Tissue (p=1.29e-04).

6. DEG1

 input_CountsDEGMethod <- 3 #DESeq2= 3,limma-voom=2,limma-trend=1 
 input_limmaPval <- 0.1 #FDR cutoff
 input_limmaFC <- 2 #Fold-change cutoff
 input_selectModelComprions <- 'Tissue: Meristem zone 1 vs. Elongation zone 2'  #Selected comparisons
 input_selectFactorsModel <- 'Tissue'   #Selected comparisons
 input_selectInteractions <- NULL   #Selected comparisons
 input_selectBlockFactorsModel <- NULL  #Selected comparisons
 factorReferenceLevels.out <- c('Tissue:Meristem zone 1') 

 limma.out <- limma()
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
 DEG.data.out <- DEG.data()
 limma.out$comparisons 
## [1] "Meristem zone 1-Elongation zone 2"
 input_selectComparisonsVenn = limma.out$comparisons[1:3] # use first three comparisons
 input_UpDownRegulated <- FALSE #Split up and down regulated genes 
 vennPlot() # Venn diagram 

  sigGeneStats() # number of DEGs as figure 

  sigGeneStatsTable() # number of DEGs as table 
##                                                         Comparisons   Up Down
## Meristem-zone-1-Elongation-zone-2 Meristem-zone-1-Elongation-zone-2 4621 3642

7. DEG2

 input_selectContrast <- 'Meristem zone 1-Elongation zone 2'    #Selected comparisons 
 selectedHeatmap.data.out <- selectedHeatmap.data()
 selectedHeatmap()   # heatmap for DEGs in selected comparison
## Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x), : 'data' must be of a vector type, was 'NULL'
 # Save gene lists and data into files
 write.csv( selectedHeatmap.data()$genes, 'heatmap.data.csv') 
 write.csv(DEG.data(),'DEG.data.csv' )
 write(AllGeneListsGMT() ,'AllGeneListsGMT.gmt')
 input_selectGO2 <- 'GOBP'  #Gene set category 
 geneListData.out <- geneListData()  
 volcanoPlot()  

  scatterPlot()  

  MAplot()  

  geneListGOTable.out <- geneListGOTable()  
 # Read pathway data again 
 GeneSets.out <-readGeneSets( geneSetFile,
    convertedData.out, input_selectGO2,input_selectOrg,
    c(input_minSetSize, input_maxSetSize)  ) 
 input_removeRedudantSets <- TRUE   #Remove highly redundant gene sets? 
 results <- geneListGO()  #Enrichment analysis
## Error in if (dim(results1)[2] == 1) return(results1) else {: argument is of length zero
 results$adj.Pval <- format( results$adj.Pval,digits=3 )
 kable( results, row.names=FALSE) %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  scroll_box(width = "100%") 
Cluster adj.Pval Genes Pathways
A 9.41e-229 253 Amide biosynthetic process
4.69e-228 244 Translation
1.31e-227 244 Peptide biosynthetic process
6.81e-221 247 Peptide metabolic process
1.32e-216 258 Cellular amide metabolic process
4.79e-180 290 Organonitrogen compound biosynthetic process
1.36e-129 153 Ribonucleoprotein complex biogenesis
2.34e-110 127 Ribosome biogenesis
3.94e-89 185 Cellular component biogenesis
9.73e-79 66 Ribosomal large subunit biogenesis
B 7.37e-49 128 Oxidation-reduction process
6.91e-39 127 Small molecule metabolic process
8.49e-39 83 Drug metabolic process
1.97e-30 62 Nucleobase-containing small molecule metabolic process
1.29e-28 122 Response to abiotic stimulus
4.83e-27 77 Response to inorganic substance
9.37e-26 54 Generation of precursor metabolites and energy
2.28e-25 54 Response to metal ion
3.17e-25 34 ATP metabolic process
3.17e-25 44 Purine-containing compound metabolic process
C 2.40e-21 36 Cell wall organization or biogenesis
6.26e-21 32 Cell wall organization
2.20e-20 32 External encapsulating structure organization
1.12e-13 20 Plant-type cell wall organization or biogenesis
2.77e-10 15 Drug catabolic process
3.26e-10 11 Hydrogen peroxide catabolic process
7.32e-10 12 Phenylpropanoid metabolic process
1.97e-09 11 Antibiotic catabolic process
2.17e-09 13 Plant-type cell wall organization
2.17e-09 15 Cell wall biogenesis
D 5.09e-16 30 Secondary metabolic process
2.61e-13 18 Phenylpropanoid metabolic process
6.94e-13 15 Phenylpropanoid biosynthetic process
6.94e-13 19 Secondary metabolite biosynthetic process
1.57e-12 14 Lignin metabolic process
1.78e-10 11 Lignin biosynthetic process
2.79e-10 53 Response to oxygen-containing compound
3.61e-09 58 Response to organic substance
5.56e-09 60 Response to abiotic stimulus
5.56e-09 37 Response to inorganic substance

STRING-db API access. We need to find the taxonomy id of your species, this used by STRING. First we try to guess the ID based on iDEP’s database. Users can also skip this step and assign NCBI taxonomy id directly by findTaxonomyID.out = 10090 # mouse 10090, human 9606 etc.

 STRING10_species = read.csv(STRING10_speciesFile)  
 ix = grep('Arabidopsis thaliana', STRING10_species$official_name ) 
 findTaxonomyID.out <- STRING10_species[ix,1] # find taxonomyID
 findTaxonomyID.out  
## [1] 3702

Enrichment analysis using STRING

  STRINGdb_geneList.out <- STRINGdb_geneList() #convert gene lists
## Warning:  we couldn't map to STRING 0% of your identifiers
 input_STRINGdbGO <- 'Process'  #'Process', 'Component', 'Function', 'KEGG', 'Pfam', 'InterPro' 
 results <- stringDB_GO_enrichmentData()  # enrichment using STRING 
 results$adj.Pval <- format( results$adj.Pval,digits=3 )
 kable( results, row.names=FALSE) %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  scroll_box(width = "100%") 
x
NULL

PPI network retrieval and analysis

 input_nGenesPPI <- 100 #Number of top genes for PPI retrieval and analysis 
 stringDB_network1(1) #Show PPI network 

Generating interactive PPI

 write(stringDB_network_link(), 'PPI_results.html') # write results to html file 
## Warning: 'string_db$get_link' is deprecated.
## Use 'Contact developers to request functionality' instead.
## See help("Deprecated")
## Warning:  we couldn't map to STRING 0% of your identifiers
## Warning: 'string_db$get_link' is deprecated.
## Use 'Contact developers to request functionality' instead.
## See help("Deprecated")

## Warning: 'string_db$get_link' is deprecated.
## Use 'Contact developers to request functionality' instead.
## See help("Deprecated")
 browseURL('PPI_results.html') # open in browser 

8. Pathway analysis

 input_selectContrast1 <- 'Meristem zone 1-Elongation zone 2'   #select Comparison 
 #input_selectContrast1 = limma.out$comparisons[3] # manually set
 input_selectGO <- 'GOBP'   #Gene set category 
 #input_selectGO='custom' # if custom gmt file
 input_minSetSize <- 15 #Min size for gene set
 input_maxSetSize <- 2000   #Max size for gene set 
 # Read pathway data again 
 GeneSets.out <-readGeneSets( geneSetFile,
    convertedData.out, input_selectGO,input_selectOrg,
    c(input_minSetSize, input_maxSetSize)  ) 
 input_pathwayPvalCutoff <- 0.2 #FDR cutoff
 input_nPathwayShow <- 30   #Top pathways to show
 input_absoluteFold <- FALSE    #Use absolute values of fold-change?
 input_GenePvalCutoff <- 1  #FDR to remove genes 

 input_pathwayMethod = 1  # 1  GAGE
 gagePathwayData.out <- gagePathwayData()  # pathway analysis using GAGE  
   
 results <- gagePathwayData.out  #Enrichment analysis for k-Means clusters  
 results$adj.Pval <- format( results$adj.Pval,digits=3 )
 kable( results, row.names=FALSE) %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  scroll_box(width = "100%") 
Direction GAGE analysis: Meristem zone 1 vs Elongation zone 2 statistic Genes adj.Pval
Down Ribonucleoprotein complex biogenesis -14.442 438 1.9e-37
Ribosome biogenesis -13.9246 343 5.8e-34
RNA modification -12.523 321 1.8e-29
NcRNA metabolic process -12.5069 425 1.0e-29
NcRNA processing -12.0699 357 1.6e-27
RRNA processing -11.3138 239 8.3e-23
RRNA metabolic process -11.215 244 1.2e-22
Ribosomal large subunit biogenesis -8.6084 99 4.2e-12
MRNA processing -8.53 378 1.9e-14
Chromatin organization -8.4884 384 1.9e-14
Ribonucleoprotein complex assembly -8.3357 174 7.0e-13
Macromolecule methylation -8.3169 211 2.8e-13
Ribonucleoprotein complex subunit organization -8.2828 182 7.5e-13
MRNA metabolic process -8.2476 481 6.8e-14
Cell cycle process -7.687 396 3.3e-12
RNA splicing -7.5065 265 3.4e-11
Ribosomal small subunit biogenesis -7.3335 80 3.9e-09
Ribosome assembly -7.2635 77 7.4e-09
DNA repair -7.2529 314 9.0e-11
Cellular response to DNA damage stimulus -7.2318 337 9.0e-11
Organelle fission -6.8164 226 1.8e-09
Nuclear division -6.7864 189 2.5e-09
TRNA metabolic process -6.5216 170 1.3e-08
RNA splicing, via transesterification reactions -6.4989 194 1.7e-08
RNA splicing, via transesterification reactions with bulged adenosine as nucleophile -6.4989 194 1.7e-08
DNA conformation change -6.2018 142 8.7e-08
Mitotic cell cycle process -6.1421 235 6.5e-08
Organelle assembly -6.1273 154 1.3e-07
DNA recombination -6.126 151 1.3e-07
Methylation -5.9282 364 1.5e-07
 pathwayListData.out = pathwayListData() 
 enrichmentPlot(pathwayListData.out, 25  ) 

  enrichmentNetwork(pathwayListData.out )  

  enrichmentNetworkPlotly(pathwayListData.out) 

## Warning: `arrange_()` is deprecated as of dplyr 0.7.0.
## Please use `arrange()` instead.
## See vignette('programming') for more help
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_warnings()` to see where this warning was generated.
 input_pathwayMethod = 3  # 1  fgsea 
 fgseaPathwayData.out <- fgseaPathwayData() #Pathway analysis using fgsea 
## Warning in fgsea(pathways = gmt, stats = fold, minSize = input_minSetSize, :
## You are trying to run fgseaSimple. It is recommended to use fgseaMultilevel. To
## run fgseaMultilevel, you need to remove the nperm argument in the fgsea function
## call.
## Warning in preparePathwaysAndStats(pathways, stats, minSize, maxSize, gseaParam, : There are ties in the preranked stats (0.06% of the list).
## The order of those tied genes will be arbitrary, which may produce unexpected results.
## Warning in fgseaSimple(...): There were 8 pathways for which P-values were not
## calculated properly due to unbalanced gene-level statistic values
 results <- fgseaPathwayData.out  #Enrichment analysis for k-Means clusters 
 results$adj.Pval <- format( results$adj.Pval,digits=3 )
 kable( results, row.names=FALSE) %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  scroll_box(width = "100%") 
Direction GSEA analysis: Meristem zone 1 vs Elongation zone 2 NES Genes adj.Pval
Down Ribosome biogenesis -3.506 343 6.3e-02
Ribonucleoprotein complex biogenesis -3.3891 438 1.7e-01
RRNA processing -3.3385 239 2.6e-02
Ribosomal large subunit biogenesis -3.3217 99 6.9e-03
RRNA metabolic process -3.303 244 2.8e-02
RNA modification -3.2379 321 5.5e-02
Ribosome assembly -3.1058 77 5.9e-03
NcRNA processing -3.0606 357 7.5e-02
Ribosomal small subunit biogenesis -3.0603 80 6.0e-03
NcRNA metabolic process -3.0486 425 1.8e-01
Ribonucleoprotein complex assembly -2.9182 174 1.4e-02
Maturation of LSU-rRNA -2.8756 42 5.1e-03
Ribonucleoprotein complex subunit organization -2.8685 182 1.5e-02
Ribosomal large subunit assembly -2.8613 40 4.9e-03
Maturation of SSU-rRNA -2.8432 52 5.1e-03
Macromolecule methylation -2.7982 211 1.9e-02
RNA methylation -2.6617 64 5.4e-03
Maturation of SSU-rRNA from tricistronic rRNA transcript (SSU-rRNA, 5.8S rRNA, LSU-rRNA) -2.6365 36 4.9e-03
Ribosomal small subunit assembly -2.6281 31 4.8e-03
Organelle assembly -2.6238 154 1.1e-02
Maturation of 5.8S rRNA -2.6198 32 4.8e-03
Mitotic cell cycle phase transition -2.5546 76 5.8e-03
Regulation of mitotic cell cycle phase transition -2.5513 48 5.1e-03
Cell cycle phase transition -2.5336 81 6.1e-03
Mitochondrial RNA metabolic process -2.5208 55 5.1e-03
Meiosis I cell cycle process -2.5124 62 5.4e-03
Cleavage involved in rRNA processing -2.5124 24 4.6e-03
Maturation of 5.8S rRNA from tricistronic rRNA transcript (SSU-rRNA, 5.8S rRNA, LSU-rRNA) -2.5016 23 4.6e-03
MRNA modification -2.4782 43 5.1e-03
Regulation of cell cycle phase transition -2.4729 52 5.1e-03
  pathwayListData.out = pathwayListData() 
 enrichmentPlot(pathwayListData.out, 25  ) 

  enrichmentNetwork(pathwayListData.out )  

  enrichmentNetworkPlotly(pathwayListData.out) 

   PGSEAplot() # pathway analysis using PGSEA 
## 
## Computing P values using ANOVA

9. Chromosome

 input_selectContrast2 <- 'Meristem zone 1-Elongation zone 2'   #select Comparison 
 #input_selectContrast2 = limma.out$comparisons[3] # manually set
 input_limmaPvalViz <- 0.1  #FDR to filter genes
 input_limmaFCViz <- 2  #FDR to filter genes 
 genomePlotly() # shows fold-changes on the genome 
## Warning in eval(quote(list(...)), env): NAs introduced by coercion
## Warning in genomePlotly(): NAs introduced by coercion

10. Biclustering

 input_nGenesBiclust <- 1000    #Top genes for biclustering
 input_biclustMethod <- 'BCCC()'    #Method: 'BCCC', 'QUBIC', 'runibic' ... 
 biclustering.out = biclustering()  # run analysis

 input_selectBicluster <- 1 #select a cluster 
 biclustHeatmap()   # heatmap for selected cluster 

 input_selectGO4 <- 'GOBP'  #Gene set category 
 # Read pathway data again 
 GeneSets.out <-readGeneSets( geneSetFile,
    convertedData.out, input_selectGO4,input_selectOrg,
    c(input_minSetSize, input_maxSetSize)  )  
 results <- geneListBclustGO()  #Enrichment analysis for k-Means clusters   
 results$adj.Pval <- format( results$adj.Pval,digits=3 )
 kable( results, row.names=FALSE) %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  scroll_box(width = "100%") 
adj.Pval Genes Pathways
7.8e-131 174 Peptide metabolic process
4.6e-128 166 Peptide biosynthetic process
4.6e-128 182 Cellular amide metabolic process
2.5e-127 165 Translation
2.5e-127 171 Amide biosynthetic process
7.7e-106 208 Organonitrogen compound biosynthetic process
6.7e-48 70 Response to cadmium ion
7.1e-44 75 Response to metal ion
6.2e-43 77 Ribonucleoprotein complex biogenesis
1.8e-41 98 Response to inorganic substance

11. Co-expression network

 input_mySoftPower <- 5 #SoftPower to cutoff
 input_nGenesNetwork <- 1000    #Number of top genes
 input_minModuleSize <- 20  #Module size minimum 
 wgcna.out = wgcna()   # run WGCNA  
## Warning: executing %dopar% sequentially: no parallel backend registered
##    Power SFT.R.sq slope truncated.R.sq mean.k. median.k. max.k.
## 1      1    0.381 1.890          0.903     904       936    949
## 2      2    0.316 1.080          0.864     845       893    916
## 3      3    0.302 0.910          0.885     799       856    890
## 4      4    0.288 0.763          0.877     761       825    867
## 5      5    0.241 0.628          0.814     728       796    846
## 6      6    0.291 0.722          0.851     698       770    827
## 7      7    0.223 0.589          0.846     672       745    810
## 8      8    0.242 0.569          0.885     648       723    794
## 9      9    0.211 0.489          0.865     625       701    778
## 10    10    0.207 0.448          0.827     605       681    764
## 11    12    0.220 0.477          0.928     569       644    737
## 12    14    0.210 0.451          0.908     537       611    713
## 13    16    0.231 0.430          0.966     508       582    691
## 14    18    0.228 0.429          0.956     483       553    671
## 15    20    0.204 0.405          0.924     460       528    653
## TOM calculation: adjacency..
## ..will not use multithreading.
##  Fraction of slow calculations: 0.000000
## ..connectivity..
## ..matrix multiplication (system BLAS)..
## ..normalization..
## ..done.
 softPower()  # soft power curve 

  modulePlot()  # plot modules  

  listWGCNA.Modules.out = listWGCNA.Modules() #modules
 input_selectGO5 <- 'GOBP'  #Gene set category 
 # Read pathway data again 
 GeneSets.out <-readGeneSets( geneSetFile,
    convertedData.out, input_selectGO5,input_selectOrg,
    c(input_minSetSize, input_maxSetSize)  ) 
 input_selectWGCNA.Module <- 'Entire network'   #Select a module
 input_topGenesNetwork <- 10    #SoftPower to cutoff
 input_edgeThreshold <- 0.4 #Number of top genes 
 moduleNetwork()    # show network of top genes in selected module
##  softConnectivity: FYI: connecitivty of genes with less than 4 valid samples will be returned as NA.
##  ..calculating connectivities..

 input_removeRedudantSets <- TRUE   #Remove redundant gene sets 
 results <- networkModuleGO()  #Enrichment analysis of selected module
 results$adj.Pval <- format( results$adj.Pval,digits=3 )
 kable( results, row.names=FALSE) %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  scroll_box(width = "100%") 
adj.Pval Genes Pathways
5.8e-138 207 Peptide metabolic process
7.9e-135 197 Peptide biosynthetic process
3.0e-134 196 Translation
2.2e-133 203 Amide biosynthetic process
2.8e-133 216 Cellular amide metabolic process
9.5e-104 247 Organonitrogen compound biosynthetic process
5.2e-54 92 Ribosome biogenesis
9.0e-54 102 Ribonucleoprotein complex biogenesis
4.8e-50 160 Cellular component biogenesis
2.3e-48 127 Response to inorganic substance